Dataset statistics
| Number of variables | 27 |
|---|---|
| Number of observations | 110148 |
| Missing cells | 478 |
| Missing cells (%) | < 0.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 22.7 MiB |
| Average record size in memory | 216.0 B |
Variable types
| NUM | 11 |
|---|---|
| CAT | 10 |
| BOOL | 5 |
| DATE | 1 |
Reproduction
| Analysis started | 2020-07-27 09:59:44.382778 |
|---|---|
| Analysis finished | 2020-07-27 10:00:25.619042 |
| Duration | 41.24 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| to_young_warn has constant value "373" | Constant |
| app_month is highly correlated with client_id and 1 other fields | High correlation |
| client_id is highly correlated with app_month and 1 other fields | High correlation |
| days_from_CD is highly correlated with client_id and 1 other fields | High correlation |
| client_id has unique values | Unique |
| decline_app_cnt has 91471 (83.0%) zeros | Zeros |
| bki_request_cnt has 28908 (26.2%) zeros | Zeros |
client_id
Real number (ℝ≥0)
| Distinct count | 110148 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 55074.5 |
|---|---|
| Minimum | 1 |
| Maximum | 110148 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5508.35 |
| Q1 | 27537.75 |
| median | 55074.5 |
| Q3 | 82611.25 |
| 95-th percentile | 104640.65 |
| Maximum | 110148 |
| Range | 110147 |
| Interquartile range (IQR) | 55073.5 |
Descriptive statistics
| Standard deviation | 31797.13306 |
|---|---|
| Coefficient of variation (CV) | 0.5773476484 |
| Kurtosis | -1.2 |
| Mean | 55074.5 |
| Median Absolute Deviation (MAD) | 27537 |
| Skewness | 0 |
| Sum | 6066346026 |
| Variance | 1011057671 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2047 | 1 | < 0.1% |
| 97541 | 1 | < 0.1% |
| 93447 | 1 | < 0.1% |
| 70920 | 1 | < 0.1% |
| 72969 | 1 | < 0.1% |
| 66826 | 1 | < 0.1% |
| 68875 | 1 | < 0.1% |
| 79116 | 1 | < 0.1% |
| 81165 | 1 | < 0.1% |
| 75022 | 1 | < 0.1% |
| Other values (110138) | 110138 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 110148 | 1 | < 0.1% |
| 110147 | 1 | < 0.1% |
| 110146 | 1 | < 0.1% |
| 110145 | 1 | < 0.1% |
| 110144 | 1 | < 0.1% |
app_date
Date
| Distinct count | 120 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
| Minimum | 2014-01-01 00:00:00 |
|---|---|
| Maximum | 2014-04-30 00:00:00 |
education
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 478 |
| Missing (%) | 0.4% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| SCH | 57998 | 52.7% |
| GRD | 34768 | 31.6% |
| UGR | 14748 | 13.4% |
| PGR | 1865 | 1.7% |
| ACD | 291 | 0.3% |
| (Missing) | 478 | 0.4% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
sex
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| F | 61836 | 56.1% |
| M | 48312 | 43.9% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
age
Real number (ℝ≥0)
| Distinct count | 52 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39.249409884882155 |
|---|---|
| Minimum | 21 |
| Maximum | 72 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 21 |
|---|---|
| 5-th percentile | 24 |
| Q1 | 30 |
| median | 37 |
| Q3 | 48 |
| 95-th percentile | 60 |
| Maximum | 72 |
| Range | 51 |
| Interquartile range (IQR) | 18 |
Descriptive statistics
| Standard deviation | 11.51806263 |
|---|---|
| Coefficient of variation (CV) | 0.2934582371 |
| Kurtosis | -0.7260121183 |
| Mean | 39.24940988 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | 0.4802480831 |
| Sum | 4323244 |
| Variance | 132.6657668 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 31 | 4084 | 3.7% |
| 28 | 4035 | 3.7% |
| 30 | 4035 | 3.7% |
| 27 | 3964 | 3.6% |
| 29 | 3940 | 3.6% |
| 26 | 3780 | 3.4% |
| 32 | 3773 | 3.4% |
| 34 | 3548 | 3.2% |
| 33 | 3499 | 3.2% |
| 35 | 3386 | 3.1% |
| Other values (42) | 72104 | 65.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21 | 1262 | 1.1% |
| 22 | 1415 | 1.3% |
| 23 | 2295 | 2.1% |
| 24 | 2780 | 2.5% |
| 25 | 3292 | 3.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 72 | 2 | < 0.1% |
| 71 | 6 | < 0.1% |
| 70 | 60 | 0.1% |
| 69 | 110 | 0.1% |
| 68 | 261 | 0.2% |
car
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| N | 74290 | 67.4% |
| Y | 35858 | 32.6% |
car_type
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| N | 89140 | 80.9% |
| Y | 21008 | 19.1% |
decline_app_cnt
Real number (ℝ≥0)
| Distinct count | 24 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2732051421723499 |
|---|---|
| Minimum | 0 |
| Maximum | 33 |
| Zeros | 91471 |
| Zeros (%) | 83.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 33 |
| Range | 33 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.799099319 |
|---|---|
| Coefficient of variation (CV) | 2.924905851 |
| Kurtosis | 101.2380998 |
| Mean | 0.2732051422 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.493006696 |
| Sum | 30093 |
| Variance | 0.6385597216 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 91471 | 83.0% |
| 1 | 12500 | 11.3% |
| 2 | 3622 | 3.3% |
| 3 | 1365 | 1.2% |
| 4 | 606 | 0.6% |
| 5 | 255 | 0.2% |
| 6 | 156 | 0.1% |
| 7 | 58 | 0.1% |
| 8 | 37 | < 0.1% |
| 9 | 29 | < 0.1% |
| Other values (14) | 49 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 91471 | 83.0% |
| 1 | 12500 | 11.3% |
| 2 | 3622 | 3.3% |
| 3 | 1365 | 1.2% |
| 4 | 606 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 33 | 1 | < 0.1% |
| 30 | 1 | < 0.1% |
| 24 | 1 | < 0.1% |
| 22 | 1 | < 0.1% |
| 21 | 1 | < 0.1% |
good_work
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 91917 | 83.4% |
| 1 | 18231 | 16.6% |
score_bki
Real number (ℝ)
| Distinct count | 102618 |
|---|---|
| Unique (%) | 93.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -1.904535048828939 |
|---|---|
| Minimum | -3.62458632 |
| Maximum | 0.19977285 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | -3.62458632 |
|---|---|
| 5-th percentile | -2.696247185 |
| Q1 | -2.26043367 |
| median | -1.92082293 |
| Q3 | -1.567888152 |
| 95-th percentile | -1.055049083 |
| Maximum | 0.19977285 |
| Range | 3.82435917 |
| Interquartile range (IQR) | 0.6925455175 |
Descriptive statistics
| Standard deviation | 0.4993974924 |
|---|---|
| Coefficient of variation (CV) | -0.2622149131 |
| Kurtosis | -0.1492918934 |
| Mean | -1.904535049 |
| Median Absolute Deviation (MAD) | 0.34574209 |
| Skewness | 0.1939872976 |
| Sum | -209780.7266 |
| Variance | 0.2493978554 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -1.77526279 | 517 | 0.5% |
| -2.1042109 | 454 | 0.4% |
| -2.22500363 | 424 | 0.4% |
| -2.16966378 | 375 | 0.3% |
| -2.02410005 | 278 | 0.3% |
| -1.92082293 | 270 | 0.2% |
| -2.38726804 | 238 | 0.2% |
| -1.52642194 | 207 | 0.2% |
| -2.44723899 | 207 | 0.2% |
| -2.2729409 | 176 | 0.2% |
| Other values (102608) | 107002 | 97.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -3.62458632 | 1 | < 0.1% |
| -3.59798083 | 1 | < 0.1% |
| -3.58258691 | 1 | < 0.1% |
| -3.57419708 | 1 | < 0.1% |
| -3.56422406 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.19977285 | 2 | < 0.1% |
| 0.1980699 | 1 | < 0.1% |
| 0.18882044 | 1 | < 0.1% |
| 0.18361297 | 1 | < 0.1% |
| 0.16854933 | 1 | < 0.1% |
bki_request_cnt
Real number (ℝ≥0)
| Distinct count | 40 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.0050023604604714 |
|---|---|
| Minimum | 0 |
| Maximum | 53 |
| Zeros | 28908 |
| Zeros (%) | 26.2% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 6 |
| Maximum | 53 |
| Range | 53 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.266925867 |
|---|---|
| Coefficient of variation (CV) | 1.130635012 |
| Kurtosis | 23.16785082 |
| Mean | 2.00500236 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.082728152 |
| Sum | 220847 |
| Variance | 5.138952887 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 28908 | 26.2% |
| 1 | 27295 | 24.8% |
| 2 | 20481 | 18.6% |
| 3 | 13670 | 12.4% |
| 4 | 8406 | 7.6% |
| 5 | 4960 | 4.5% |
| 6 | 2500 | 2.3% |
| 7 | 1292 | 1.2% |
| 8 | 735 | 0.7% |
| 9 | 459 | 0.4% |
| Other values (30) | 1442 | 1.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 28908 | 26.2% |
| 1 | 27295 | 24.8% |
| 2 | 20481 | 18.6% |
| 3 | 13670 | 12.4% |
| 4 | 8406 | 7.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 53 | 1 | < 0.1% |
| 47 | 1 | < 0.1% |
| 46 | 1 | < 0.1% |
| 45 | 1 | < 0.1% |
| 41 | 1 | < 0.1% |
region_rating
Real number (ℝ≥0)
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 56.7511893089298 |
|---|---|
| Minimum | 20 |
| Maximum | 80 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 40 |
| Q1 | 50 |
| median | 50 |
| Q3 | 60 |
| 95-th percentile | 80 |
| Maximum | 80 |
| Range | 60 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 13.06592289 |
|---|---|
| Coefficient of variation (CV) | 0.2302317017 |
| Kurtosis | -0.6334345368 |
| Mean | 56.75118931 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.4778692262 |
| Sum | 6251030 |
| Variance | 170.7183409 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 50 | 40981 | 37.2% |
| 60 | 23999 | 21.8% |
| 40 | 17947 | 16.3% |
| 80 | 17170 | 15.6% |
| 70 | 9304 | 8.4% |
| 30 | 434 | 0.4% |
| 20 | 313 | 0.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 313 | 0.3% |
| 30 | 434 | 0.4% |
| 40 | 17947 | 16.3% |
| 50 | 40981 | 37.2% |
| 60 | 23999 | 21.8% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 80 | 17170 | 15.6% |
| 70 | 9304 | 8.4% |
| 60 | 23999 | 21.8% |
| 50 | 40981 | 37.2% |
| 40 | 17947 | 16.3% |
home_address
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 59591 | 54.1% |
| 1 | 48688 | 44.2% |
| 3 | 1869 | 1.7% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
work_address
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 67113 | 60.9% |
| 2 | 30761 | 27.9% |
| 1 | 12274 | 11.1% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
income
Real number (ℝ≥0)
| Distinct count | 1207 |
|---|---|
| Unique (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 41012.648536514505 |
|---|---|
| Minimum | 1000 |
| Maximum | 1000000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 10000 |
| Q1 | 20000 |
| median | 30000 |
| Q3 | 48000 |
| 95-th percentile | 100000 |
| Maximum | 1000000 |
| Range | 999000 |
| Interquartile range (IQR) | 28000 |
Descriptive statistics
| Standard deviation | 45399.73505 |
|---|---|
| Coefficient of variation (CV) | 1.10696911 |
| Kurtosis | 100.1746159 |
| Mean | 41012.64854 |
| Median Absolute Deviation (MAD) | 12000 |
| Skewness | 7.503020095 |
| Sum | 4517461211 |
| Variance | 2061135943 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 30000 | 10437 | 9.5% |
| 25000 | 9090 | 8.3% |
| 20000 | 8174 | 7.4% |
| 40000 | 7383 | 6.7% |
| 50000 | 6742 | 6.1% |
| 35000 | 6319 | 5.7% |
| 15000 | 5874 | 5.3% |
| 60000 | 3818 | 3.5% |
| 45000 | 3670 | 3.3% |
| 18000 | 2732 | 2.5% |
| Other values (1197) | 45909 | 41.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1000 | 6 | < 0.1% |
| 1100 | 1 | < 0.1% |
| 1200 | 1 | < 0.1% |
| 1500 | 2 | < 0.1% |
| 1700 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1000000 | 13 | < 0.1% |
| 999999 | 4 | < 0.1% |
| 999000 | 2 | < 0.1% |
| 990000 | 1 | < 0.1% |
| 950000 | 4 | < 0.1% |
sna
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 70681 | 64.2% |
| 4 | 17481 | 15.9% |
| 2 | 15832 | 14.4% |
| 3 | 6154 | 5.6% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
first_time
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 46588 | 42.3% |
| 4 | 28017 | 25.4% |
| 1 | 18296 | 16.6% |
| 2 | 17247 | 15.7% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
foreign_passport
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| N | 93721 | 85.1% |
| Y | 16427 | 14.9% |
default
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 64427 | 58.5% |
| -1 | 36349 | 33.0% |
| 1 | 9372 | 8.5% |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.330001453 |
| Min length | 1 |
train
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 73799 | 67.0% |
| 0 | 36349 | 33.0% |
car1
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 74290 | 67.4% |
| 2 | 21008 | 19.1% |
| 1 | 14850 | 13.5% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
app_month
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 860.5 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 31597 | 28.7% |
| 2 | 27097 | 24.6% |
| 4 | 26266 | 23.8% |
| 1 | 25188 | 22.9% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
score_bki_income
Real number (ℝ≥0)
| Distinct count | 106475 |
|---|---|
| Unique (%) | 96.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.025660981485143224 |
|---|---|
| Minimum | 0.000521989089 |
| Maximum | 0.744501083 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 0.000521989089 |
|---|---|
| 5-th percentile | 0.006124594341 |
| Q1 | 0.0135169247 |
| median | 0.02093235222 |
| Q3 | 0.03179510595 |
| 95-th percentile | 0.06150379223 |
| Maximum | 0.744501083 |
| Range | 0.7439790939 |
| Interquartile range (IQR) | 0.01827818125 |
Descriptive statistics
| Standard deviation | 0.01991195944 |
|---|---|
| Coefficient of variation (CV) | 0.7759625035 |
| Kurtosis | 70.34170447 |
| Mean | 0.02566098149 |
| Median Absolute Deviation (MAD) | 0.008542532441 |
| Skewness | 4.309870426 |
| Sum | 2826.505789 |
| Variance | 0.0003964861286 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0307248087 | 55 | < 0.1% |
| 0.0204832058 | 51 | < 0.1% |
| 0.0258956394 | 48 | < 0.1% |
| 0.03236954925 | 46 | < 0.1% |
| 0.02409667604 | 45 | < 0.1% |
| 0.02008056337 | 45 | < 0.1% |
| 0.02109449903 | 44 | < 0.1% |
| 0.03012084505 | 44 | < 0.1% |
| 0.0215796995 | 44 | < 0.1% |
| 0.01618477463 | 43 | < 0.1% |
| Other values (106465) | 109683 | 99.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.000521989089 | 1 | < 0.1% |
| 0.000536764093 | 1 | < 0.1% |
| 0.000554029853 | 1 | < 0.1% |
| 0.000567950726 | 1 | < 0.1% |
| 0.000569325035 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.744501083 | 1 | < 0.1% |
| 0.677866836 | 1 | < 0.1% |
| 0.622507259 | 1 | < 0.1% |
| 0.6213791 | 1 | < 0.1% |
| 0.589612089 | 1 | < 0.1% |
days_from_CD
Real number (ℝ≥0)
| Distinct count | 120 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2130.0285343356213 |
|---|---|
| Minimum | 2072 |
| Maximum | 2191 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 2072 |
|---|---|
| 5-th percentile | 2079 |
| Q1 | 2102 |
| median | 2129 |
| Q3 | 2158 |
| 95-th percentile | 2180 |
| Maximum | 2191 |
| Range | 119 |
| Interquartile range (IQR) | 56 |
Descriptive statistics
| Standard deviation | 32.07607842 |
|---|---|
| Coefficient of variation (CV) | 0.01505899001 |
| Kurtosis | -1.141261894 |
| Mean | 2130.028534 |
| Median Absolute Deviation (MAD) | 28 |
| Skewness | 0.007687821604 |
| Sum | 234618383 |
| Variance | 1028.874807 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2115 | 1491 | 1.4% |
| 2114 | 1363 | 1.2% |
| 2116 | 1350 | 1.2% |
| 2102 | 1317 | 1.2% |
| 2095 | 1296 | 1.2% |
| 2100 | 1291 | 1.2% |
| 2122 | 1245 | 1.1% |
| 2129 | 1242 | 1.1% |
| 2101 | 1239 | 1.1% |
| 2150 | 1233 | 1.1% |
| Other values (110) | 97081 | 88.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2072 | 865 | 0.8% |
| 2073 | 546 | 0.5% |
| 2074 | 878 | 0.8% |
| 2075 | 497 | 0.5% |
| 2076 | 643 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2191 | 56 | 0.1% |
| 2190 | 204 | 0.2% |
| 2189 | 313 | 0.3% |
| 2188 | 447 | 0.4% |
| 2187 | 425 | 0.4% |
score_bki_region_rating
Real number (ℝ)
| Distinct count | 104371 |
|---|---|
| Unique (%) | 94.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -3.470444237018258 |
|---|---|
| Minimum | -15.97430457142857 |
| Maximum | 0.8026158571428571 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | -15.97430457 |
|---|---|
| 5-th percentile | -5.550553024 |
| Q1 | -4.23608926 |
| median | -3.375094574 |
| Q3 | -2.60611246 |
| 95-th percentile | -1.664678955 |
| Maximum | 0.8026158571 |
| Range | 16.77692043 |
| Interquartile range (IQR) | 1.629976801 |
Descriptive statistics
| Standard deviation | 1.22549637 |
|---|---|
| Coefficient of variation (CV) | -0.3531237751 |
| Kurtosis | 2.62070108 |
| Mean | -3.470444237 |
| Median Absolute Deviation (MAD) | 0.8101621287 |
| Skewness | -0.7572653356 |
| Sum | -382262.4918 |
| Variance | 1.501841353 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -4.125903725 | 176 | 0.2% |
| -3.480907431 | 170 | 0.2% |
| -4.362752216 | 153 | 0.1% |
| -2.191682457 | 111 | 0.1% |
| -4.254242706 | 109 | 0.1% |
| -3.449526066 | 106 | 0.1% |
| -2.910266869 | 101 | 0.1% |
| -3.968823627 | 100 | 0.1% |
| -4.680917725 | 100 | 0.1% |
| -3.647546934 | 96 | 0.1% |
| Other values (104361) | 108926 | 98.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -15.97430457 | 1 | < 0.1% |
| -15.40041743 | 1 | < 0.1% |
| -15.38181781 | 1 | < 0.1% |
| -14.59737719 | 1 | < 0.1% |
| -14.46257267 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.8026158571 | 1 | < 0.1% |
| 0.4872508537 | 1 | < 0.1% |
| 0.3917114706 | 1 | < 0.1% |
| 0.3247047541 | 1 | < 0.1% |
| 0.3095417049 | 1 | < 0.1% |
delta_app
Real number (ℝ≥0)
| Distinct count | 78548 |
|---|---|
| Unique (%) | 71.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0977638370944993 |
|---|---|
| Minimum | 0.0 |
| Maximum | 5.254020844302736 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 860.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.9731908536 |
| Q1 | 1 |
| median | 1.069635614 |
| Q3 | 1.151407113 |
| 95-th percentile | 1.319390962 |
| Maximum | 5.254020844 |
| Range | 5.254020844 |
| Interquartile range (IQR) | 0.1514071127 |
Descriptive statistics
| Standard deviation | 0.1250967674 |
|---|---|
| Coefficient of variation (CV) | 0.1139559924 |
| Kurtosis | 22.93969712 |
| Mean | 1.097763837 |
| Median Absolute Deviation (MAD) | 0.0696356142 |
| Skewness | 2.306375339 |
| Sum | 120916.4911 |
| Variance | 0.01564920122 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 28710 | 26.1% |
| 1.068083073 | 119 | 0.1% |
| 1.054321415 | 118 | 0.1% |
| 1.064386926 | 109 | 0.1% |
| 1.066389724 | 102 | 0.1% |
| 1.058775422 | 78 | 0.1% |
| 1.061935607 | 78 | 0.1% |
| 1.073048215 | 74 | 0.1% |
| 1.10864283 | 61 | 0.1% |
| 1.128773851 | 53 | < 0.1% |
| Other values (78538) | 80646 | 73.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% |
| 0.06262651753 | 1 | < 0.1% |
| 0.206919472 | 1 | < 0.1% |
| 0.2469056693 | 1 | < 0.1% |
| 0.2883295591 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5.254020844 | 1 | < 0.1% |
| 4.005621875 | 1 | < 0.1% |
| 2.773129797 | 1 | < 0.1% |
| 2.655646902 | 1 | < 0.1% |
| 2.645422178 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1, where -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
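The calculation described above can be sketched in plain Python. This is a minimal illustration on made-up numbers, not the code the report generator runs; the function name `pearson_r` and the sample lists are assumptions introduced here.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and the standard deviations cancel,
    so plain sums of centred products are enough.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample data, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_linear = pearson_r(x, y)                          # perfectly linear, increasing
r_shifted = pearson_r([10 * a + 3 for a in x], y)   # shift and positive rescale of x
r_negative = pearson_r(x, [-b for b in y])          # perfectly linear, decreasing
```

Shifting or rescaling `x` by a positive factor leaves the result unchanged, which is the location and scale invariance mentioned in the text; negating one variable flips the sign.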
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, where -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
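As a rough sketch of that definition, ρ can be computed by ranking each variable and then applying the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are assumptions made for illustration; ties are assumed to share the average of their positions.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y grows monotonically but not linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [a ** 3 for a in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
tied_ranks = average_ranks([10, 20, 20, 30])
```

On this cubic relationship the ranks agree perfectly, so ρ reaches 1 while Pearson's r on the raw values stays below 1.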
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, where -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
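The pair-counting rule above can be written out directly. The sketch below implements exactly the formula as stated (often called tau-a); statistical libraries commonly report a tie-corrected variant instead, so values may differ when the data contain ties. The function name `kendall_tau` and the sample data are assumptions for illustration.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)   # 5 concordant, 1 discordant, 6 pairs in total
```

This brute-force version visits every pair, so it scales quadratically with the number of observations; it is meant to mirror the definition rather than to be fast.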
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
|  | client_id | app_date | education | sex | age | car | car_type | decline_app_cnt | good_work | score_bki | bki_request_cnt | region_rating | home_address | work_address | income | sna | first_time | foreign_passport | default | train | car1 | app_month | score_bki_income | days_from_CD | score_bki_region_rating | delta_app | to_young_warn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 25905 | 2014-02-01 | SCH | M | 62 | Y | Y | 0 | 0 | -2.008753 | 1 | 50 | 1 | 2 | 18000 | 4 | 1 | N | 0 | 1 | 2 | 2 | 0.034669 | 2160 | -3.938731 | 1.061466 | 373 |
| 1 | 63161 | 2014-03-12 | SCH | F | 59 | N | N | 0 | 0 | -1.532276 | 3 | 50 | 2 | 3 | 19000 | 4 | 1 | N | 0 | 1 | 0 | 3 | 0.035352 | 2121 | -3.004463 | 1.140659 | 373 |
| 2 | 25887 | 2014-02-01 | SCH | M | 25 | Y | N | 2 | 0 | -1.408142 | 1 | 80 | 1 | 2 | 30000 | 1 | 4 | Y | 0 | 1 | 1 | 2 | 0.022803 | 2160 | -1.738447 | 0.956912 | 373 |
| 3 | 16222 | 2014-01-23 | SCH | F | 53 | N | N | 0 | 0 | -2.057471 | 2 | 50 | 2 | 3 | 10000 | 1 | 3 | N | 0 | 1 | 0 | 1 | 0.061917 | 2169 | -4.034258 | 1.125913 | 373 |
| 4 | 101655 | 2014-04-18 | GRD | M | 48 | N | N | 0 | 1 | -1.244723 | 1 | 60 | 2 | 3 | 30000 | 1 | 4 | Y | 0 | 1 | 0 | 4 | 0.023348 | 2084 | -2.040529 | 1.038087 | 373 |
| 5 | 41415 | 2014-02-18 | SCH | M | 27 | Y | N | 0 | 1 | -2.032257 | 0 | 50 | 1 | 1 | 15000 | 2 | 3 | N | 0 | 1 | 1 | 2 | 0.041446 | 2143 | -3.984818 | 1.000000 | 373 |
| 6 | 28436 | 2014-02-04 | SCH | M | 39 | N | N | 0 | 0 | -2.225004 | 0 | 60 | 1 | 2 | 28000 | 1 | 1 | N | 0 | 1 | 0 | 2 | 0.021515 | 2157 | -3.647547 | 1.000000 | 373 |
| 7 | 68769 | 2014-03-17 | SCH | F | 39 | N | N | 0 | 0 | -1.522739 | 1 | 50 | 2 | 3 | 45000 | 3 | 3 | N | 0 | 1 | 0 | 3 | 0.014948 | 2116 | -2.985763 | 1.046594 | 373 |
| 8 | 38424 | 2014-02-14 | SCH | F | 50 | Y | N | 1 | 0 | -1.676061 | 0 | 50 | 1 | 1 | 30000 | 1 | 4 | N | 0 | 1 | 1 | 2 | 0.021910 | 2147 | -3.286394 | 0.948714 | 373 |
| 9 | 4496 | 2014-01-10 | UGR | F | 54 | N | N | 0 | 0 | -2.695176 | 1 | 50 | 2 | 3 | 24000 | 1 | 3 | N | 0 | 1 | 0 | 1 | 0.023142 | 2182 | -5.284658 | 1.082470 | 373 |
Last rows
|  | client_id | app_date | education | sex | age | car | car_type | decline_app_cnt | good_work | score_bki | bki_request_cnt | region_rating | home_address | work_address | income | sna | first_time | foreign_passport | default | train | car1 | app_month | score_bki_income | days_from_CD | score_bki_region_rating | delta_app | to_young_warn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 110138 | 16072 | 2014-01-23 | GRD | F | 28 | N | N | 0 | 0 | -1.651781 | 4 | 60 | 1 | 2 | 13000 | 1 | 3 | N | -1 | 0 | 0 | 1 | 0.050749 | 2169 | -2.707837 | 1.202172 | 373 |
| 110139 | 10090 | 2014-01-17 | SCH | F | 53 | Y | N | 0 | 0 | -1.845058 | 2 | 50 | 1 | 2 | 7000 | 1 | 1 | N | -1 | 0 | 1 | 1 | 0.091487 | 2175 | -3.617760 | 1.112914 | 373 |
| 110140 | 90435 | 2014-04-07 | UGR | F | 48 | N | N | 0 | 0 | -2.066300 | 1 | 60 | 1 | 1 | 27000 | 1 | 4 | N | -1 | 0 | 0 | 4 | 0.022900 | 2095 | -3.387376 | 1.063227 | 373 |
| 110141 | 42509 | 2014-02-19 | SCH | F | 58 | Y | Y | 0 | 1 | -1.857117 | 1 | 50 | 2 | 3 | 25000 | 4 | 3 | N | -1 | 0 | 2 | 2 | 0.025568 | 2142 | -3.641406 | 1.056826 | 373 |
| 110142 | 72405 | 2014-03-20 | SCH | F | 40 | N | N | 0 | 0 | -2.039905 | 0 | 50 | 2 | 3 | 20000 | 4 | 1 | N | -1 | 0 | 0 | 3 | 0.031046 | 2113 | -3.999814 | 1.000000 | 373 |
| 110143 | 83775 | 2014-03-31 | SCH | F | 37 | N | N | 1 | 0 | -1.744976 | 3 | 50 | 2 | 3 | 15000 | 4 | 1 | N | -1 | 0 | 0 | 3 | 0.043361 | 2102 | -3.421521 | 1.106789 | 373 |
| 110144 | 106254 | 2014-04-25 | GRD | F | 64 | Y | Y | 0 | 0 | -2.293781 | 3 | 60 | 1 | 2 | 200000 | 1 | 4 | N | -1 | 0 | 2 | 4 | 0.002978 | 2077 | -3.760297 | 1.210563 | 373 |
| 110145 | 81852 | 2014-03-30 | GRD | M | 31 | N | N | 2 | 0 | -0.940752 | 1 | 50 | 1 | 2 | 60000 | 4 | 2 | N | -1 | 0 | 0 | 3 | 0.012181 | 2103 | -1.844612 | 0.971214 | 373 |
| 110146 | 1971 | 2014-01-07 | UGR | F | 27 | N | N | 1 | 0 | -1.242392 | 2 | 80 | 2 | 3 | 30000 | 1 | 1 | N | -1 | 0 | 0 | 1 | 0.023356 | 2185 | -1.533817 | 1.038016 | 373 |
| 110147 | 69044 | 2014-03-17 | SCH | M | 38 | N | N | 0 | 0 | -1.507549 | 2 | 50 | 1 | 2 | 15000 | 4 | 2 | N | -1 | 0 | 0 | 3 | 0.044944 | 2116 | -2.955979 | 1.092259 | 373 |
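For reference, the bias-corrected Cramér's V mentioned in the correlation notes can be sketched as follows. This is an illustrative reading of Bergsma's correction applied to a contingency table of counts, not the report generator's own code; the function name `cramers_v_corrected` and the two example tables are assumptions, and the sketch assumes no row or column of the table is entirely zero.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n_rows = len(table)
    n_cols = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the effective table dimensions.
    phi2_corrected = max(0.0, phi2 - (n_rows - 1) * (n_cols - 1) / (n - 1))
    rows_corrected = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corrected = n_cols - (n_cols - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corrected / min(rows_corrected - 1, cols_corrected - 1))

# Hypothetical tables, not derived from the profiled dataset.
independent = [[10, 20], [20, 40]]   # rows are proportional: no association
perfect = [[50, 0], [0, 50]]         # each row maps to exactly one column

v_independent = cramers_v_corrected(independent)
v_perfect = cramers_v_corrected(perfect)
```

The two toy tables hit the ends of the range described earlier: proportional rows give 0, and a one-to-one mapping between categories gives 1.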